### A BiCMOS circuit family for a 0.45um CMOS/BiCMOS Sea of gates

Gerard BOUDON, Dominique PLASSAT, Rene CULLET, Robert TRAUET, Daniel MAUCHAUFFEE

**IBM** France, Component Development Laboratory 91102 Corbeil-Essonnes, France

#### - ABSTRACT

A BiCMOS standard cell circuit library is mixed into an existing 300K CMOS ASIC sea of gates featuring 0.45um Leff FETs. The 3.6V BiCMOS circuits, built with a 15 GHz NPN, are compatible in voltage levels with the CMOS circuits. Various terminator / level shifter BiCMOS circuits permit performance from 180ps to 220ps. A Booth / Wallace tree 16x16 multiplier and an array of 2,700 And-Or's for a noise experiment have been implemented on an 86K-cell, 6.7mm chip.

#### 1 - INTRODUCTION

The advantage in silicon area and performance of a standard cell over a gate array is larger in BiCMOS than in CMOS. The design of a BiCMOS standard cell library mixed with CMOS circuits on an existing 300K CMOS ASIC [1] makes it possible to confirm these differences. With 0.45um Leff FETs, mixing CMOS and BiCMOS circuits is made possible by a new level shifter circuit that limits the DC power dissipation.

The BiCMOS circuits are used only for high loads, above 0.5pF, which represent on average 20% of the total circuits on a typical VLSI logic chip. Using BiCMOS circuits only where they are needed results in a significant speed improvement with a minor penalty on chip density and yield.
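The selection rule described here can be sketched as a simple threshold test on the load of each net. This is only an illustration: the function names and the example loads are hypothetical, and only the 0.5pF threshold comes from the text.

```python
# Minimal sketch of the "BiCMOS only where needed" rule.
# Only the 0.5 pF threshold comes from the text; the names and the
# example loads below are hypothetical.

BICMOS_LOAD_THRESHOLD_PF = 0.5


def pick_circuit_family(load_pf: float) -> str:
    """Return "BiCMOS" for heavily loaded nets, "CMOS" otherwise."""
    return "BiCMOS" if load_pf > BICMOS_LOAD_THRESHOLD_PF else "CMOS"


def bicmos_fraction(loads_pf: list[float]) -> float:
    """Fraction of nets whose driver would be a BiCMOS circuit."""
    if not loads_pf:
        return 0.0
    chosen = [pick_circuit_family(c) for c in loads_pf]
    return chosen.count("BiCMOS") / len(chosen)


if __name__ == "__main__":
    example_loads_pf = [0.1, 0.2, 0.3, 0.8, 0.15]  # hypothetical net loads
    print(bicmos_fraction(example_loads_pf))
```

A real flow would presumably apply such a test only on critical paths, since the text reserves BiCMOS for places where it is needed.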

Based on the implementation in a multiplier or in a carry look-ahead adder, a 25 to 40% improvement in data processor cycle time is possible with the selective use of BiCMOS circuits as buffering stages in the critical paths. It is worthwhile to use BiCMOS only in a mixed approach. A BiCMOS gate array cannot keep the speed advantage because of a 2X density penalty when compared to CMOS. With the same technology, the wiring capacitance is 1.4X and the RC time constant 2X higher than with CMOS, leaving little advantage for the BiCMOS.
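One way to read these factors, assuming (this is not stated in the text) that a 2X density penalty doubles the area occupied by a given logic function, that the average wire length $L$ scales with the square root of that area, and that wire resistance and capacitance are both proportional to length:

$$
\frac{L_{GA}}{L_{CMOS}} = \sqrt{2} \approx 1.4, \qquad
\frac{C_{GA}}{C_{CMOS}} = \frac{L_{GA}}{L_{CMOS}} \approx 1.4, \qquad
\frac{(RC)_{GA}}{(RC)_{CMOS}} = \left(\frac{L_{GA}}{L_{CMOS}}\right)^{2} = 2
$$

Here the subscript $GA$ denotes the BiCMOS gate array. Under these assumptions the 1.4X capacitance and 2X RC figures follow directly from the 2X density penalty.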

#### 2 - TECHNOLOGY

The BiCMOS technology [2] is based on a 0.8um minimum image, four-level-metal CMOS process with the addition of a 15 GHz Ft poly emitter NPN. The FETs have a 0.45um nominal channel length and a 12nm gate oxide, with a DI-LDD structure in the NFETs to handle high voltage breakdown and avoid hot electron effects under 3.6V. Good latch-up immunity is obtained with a 20 $\Omega/\square$ N+ buried layer NPN collector (fig 1).



Fig 1: BiCMOS Process Cross Section



Fig 2: Full swing BiCMOS circuits compatible with CMOS

The single N+ doped polysilicon used as emitter and FET gate electrode is polycided to reduce the series resistance, and a local interconnect layer with low current capability enhances density. The global wires are routed on the first, second and third levels of metal with a 2.4um pitch. The last metal (4.8um pitch) is used for power distribution and for wiring the off-chip signal I/Os to the solder ball terminals attached to the ceramic package. All the metal lines are aluminum with tungsten vias.

#### 3 - BICMOS LOGIC CIRCUIT SELECTION

The BiCMOS circuits are used in a sprinkle mode in the CMOS sea of gates. Short channel FETs have a threshold voltage of 0.5V, and the BiCMOS circuits have to be modified to be level-compatible with CMOS, with a rail-to-rail output voltage swing. Terminator / level shifter circuits [3] such as latch, feedback or parallel CMOS have been compared (figure 2). The full swing Multi-Emitter [4] circuit with these terminators is also part of the library. The circuits are compared to the traditional BiCMOS or to the Multi-Emitter with resistances.

The choice of the right BiCMOS circuit is guided by criteria such as speed, power, area and AC undetectable defects.

The best circuit for speed is the BiCMOS with a feedback structure made of one inverter controlling FETs which short the E-B junction of the bipolars at the end of the transitions. The use of this circuit is risky since it has 3 inverters in series, which are at the origin of the 1.5ns period oscillations on the transfer curves of figure 3. It also has a higher level of AC undetectable defects. A stuck-at-one or stuck-at-zero fault on the output of the inverter permanently switches off one of the NPNs, resulting in a low speed (720ps instead of 220ps) while the circuit remains functional.
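As a rough order-of-magnitude check, and assuming the three inverting stages behave like an ideal ring oscillator with equal stage delays $t_d$ (a simplification that is not claimed in the text), the oscillation period of an $N$-stage ring is $T = 2 N t_d$, so

$$
t_d = \frac{T}{2N} = \frac{1.5\,\mathrm{ns}}{2 \times 3} = 250\,\mathrm{ps}
$$

per stage, which is of the same order as the 220ps delay quoted for this circuit.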



Fig 3 : DC Transfer curve for NAND 2 BiCMOS with active feed-back terminator.

For a robust design, complex functions are built with the latch terminator shorting the C-E of the NPNs, and the invert function, which is the most used circuit in the library, is built with the CMOS in parallel.

Multi-Emitter (ME): The Multi-Emitter circuit can be arranged with terminators as above or can be built with resistances. Results of electrical analysis show a lower performance improvement with the full swing ME than with the original ME. The main difference comes from a higher voltage swing. However, for a high number of inputs the ME circuit has been included in the library, giving 225ps instead of 250ps for a NAND with 4 inputs.



Fig 4: Delay vs load for 2-input NAND BiCMOS for various circuit options (FO=1, nominal, 3.6V, 60C)

Hardware measurements of ring oscillators have confirmed the accuracy of the electrical simulations. In a speed comparison of the various BiCMOS circuit types, the Multi-Emitter is only 5% faster than the feedback circuit.

#### 4 - BOOK LAYOUT

The BiCMOS circuits have been designed (Fig 5) with the same constraints as the existing CMOS circuits to permit mixing them on the same large sea of cells chip:

- Internal cell: 57.6 x 7.2 um2 (24 x 3 M2-M3 wiring tracks)
- I/O cell size: 345.6 x 115.2 um2
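The cell dimensions can be cross-checked against the wiring pitch. The sketch below assumes that the track counts refer to the 2.4um pitch given for metal 1 to 3 in this technology, and it works in integer nanometres to avoid rounding issues; the helper name `tracks` is hypothetical.

```python
# Cross-check of cell dimensions against the 2.4 um wiring pitch.
# Lengths are integer nanometres; `tracks` is a hypothetical helper.

PITCH_NM = 2400  # metal 1-3 wiring pitch (2.4 um)


def tracks(length_nm: int, pitch_nm: int = PITCH_NM) -> int:
    """Number of wiring tracks spanning `length_nm`; must divide exactly."""
    count, remainder = divmod(length_nm, pitch_nm)
    if remainder:
        raise ValueError(f"{length_nm} nm is not a whole number of tracks")
    return count


internal_cell = (tracks(57_600), tracks(7_200))    # 57.6 x 7.2 um2
io_cell = (tracks(345_600), tracks(115_200))       # 345.6 x 115.2 um2

print("internal cell tracks:", internal_cell)
print("I/O cell tracks:", io_cell)
```

The internal cell comes out at 24 x 3 tracks, matching the figure quoted above, and under the same assumption the I/O cell spans 144 x 48 tracks.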

The wiring of the gate is made with metal 1 between the 2 main power busses. The global wiring between gates is routed vertically on M2 and horizontally on M1 and M3. To ease final chip wiring, ten M1 free channels can be used and the circuits input / output ports can be accessed on different M2 channels.

To improve the decoupling of the noise induced by high current spikes, a 0.15pF thin oxide poly-Nwell capacitance is laid out with each pair of NPNs, and a discrete capacitance of 0.4pF is automatically placed in all the unused cells of the chip. An alternative is to personalize a CMOS gate array background as a decoupling capacitor, with the capability for the logic designer to use this gate array for late fixes in the logic [5]. These capacitances are necessary to compensate for the low Nwell-substrate capacitance imposed by the performance of the bipolar.

The saturation of the NPNs, due to the voltage drop caused by a high collector current, induces latch-up. To decrease this effect, an N+ guard ring structure plus a low 20 $\Omega/\square$ N+ subcollector have been used for the top NPN, which is integrated with the PFETs. For the same reason, the collector / Nwell contact has been placed just under the Vcc power bus running on the first level of metal. Integrating the NPN and the PFETs in the same Nwell bed has improved the density of the BiCMOS gates.



Fig 5 : Standard cell 2 input NAND with Latch in 5 cells of 7.2um x 57.6um.

#### 5 - 6.7 MM CHIP PROTOTYPE

A 6.7mm chip prototype with 86,528 cells and 168 signal I/Os has been built. It includes a 16x16 bit multiplier with Wallace tree, carry look-ahead and carry select architecture, an internal noise structure with 2,700 loaded BiCMOS 2x2 And-Or's, and a large library of AC and DC testable BiCMOS/CMOS circuits. Figure 8 shows the different macros of the chip prototype with the CMOS circuits (empty boxes), the BiCMOS circuits (grey boxes) and the bipolars in black, illustrating the sprinkle approach. The design has been made with boundary scan LSSD latches to permit testing on a low cost tester with a reduced number of AC pins [6]. The design has been generated automatically with a set of IBM proprietary CAD software tools [7], with the flat logic description as the starting point.
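As a sanity check on these numbers, the area covered by the cells can be compared with the die area. The sketch assumes that "6.7mm" is the side of a square die and that all 86,528 cells have the 57.6 x 7.2 um2 internal cell size; the square-die assumption and the variable names are not taken from the text.

```python
# Rough area budget, in integer units of 0.1 um to stay exact.
# Assumption: the die is a 6.7 mm x 6.7 mm square.

CELL_W, CELL_H = 576, 72    # 57.6 um x 7.2 um internal cell
N_CELLS = 86_528            # cell count of the prototype
DIE_SIDE = 67_000           # 6.7 mm

die_area = DIE_SIDE ** 2
cell_area = N_CELLS * CELL_W * CELL_H
cell_fraction = cell_area / die_area

print(f"cells cover about {cell_fraction:.0%} of the die")
```

Under these assumptions the cell array occupies roughly four fifths of the die, which leaves the remainder for the I/O ring and power distribution.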

#### 16 x 16 Multiplier (Fig 6)

A high speed multiplier based on the Booth algorithm, the Wallace tree structure and a 28-bit carry look-ahead final adder has been designed. The final adder is composed of a 16-bit carry look-ahead circuit in series with a carry select adder for the 12 most significant bits. Booth encoding reduces the number of full adders in the critical path from 14 to 7. The Wallace tree associated with the Booth algorithm reduces this number to 4, as shown on the bit slice of figure 6b.
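The adder counts above can be illustrated with a small sketch. It rests on assumptions that the text does not state: radix-4 Booth recoding, signed two's-complement operands, and one extra correction row after recoding (nine rows in total). The function names are hypothetical and this is not the authors' implementation.

```python
# Sketch: radix-4 Booth recoding and carry-save adder depth.
# Assumptions (not stated in the text): radix-4 recoding, signed
# two's-complement operands, one extra correction row after recoding.


def booth_radix4_digits(y: int, bits: int = 16) -> list[int]:
    """Recode a signed `bits`-wide multiplier into digits in {-2..2}."""
    if bits % 2:
        raise ValueError("bits must be even")
    y &= (1 << bits) - 1            # two's-complement bit pattern
    digits, prev = [], 0            # prev is the implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x: int, y: int, bits: int = 16) -> int:
    """Sum the recoded partial products d_i * x * 4**i."""
    digits = booth_radix4_digits(y, bits)
    return sum(d * x * 4 ** i for i, d in enumerate(digits))


def linear_array_levels(rows: int) -> int:
    """Full adders in series when rows are accumulated one after another."""
    return max(rows - 2, 0)


def wallace_levels(rows: int) -> int:
    """Levels of 3:2 compressors needed to reduce `rows` operands to 2."""
    levels = 0
    while rows > 2:
        rows = 2 * (rows // 3) + rows % 3
        levels += 1
    return levels


if __name__ == "__main__":
    print(len(booth_radix4_digits(-12345)))
    print(linear_array_levels(16), linear_array_levels(9), wallace_levels(9))
```

With 16 unrecoded rows the linear array gives 14 adder levels. With nine rows it gives 7, and a Wallace arrangement of the same nine rows gives 4. These values are consistent with the figures quoted above, although the actual row count of the design is not stated.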

The performance of this multiplier can be measured through a recirculating loop crossing the whole adder array. A complete multiplication time of less than 5ns is obtained, to be compared with 7.1ns with CMOS-only circuits. In ring oscillation mode the 2 inputs X (16-bit multiplicand) and Y (16-bit multiplier) are forced to [111---110] and [000---000] respectively, with the most significant bit of the result $Z_{32}$ inverted and fed back to the first and the last bits of Y. Under such conditions a "1" added at the second row of the array makes it possible to observe the propagation of a bit through the whole array, giving the worst case delay.
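For reference, the two multiplication times quoted above correspond to a delay reduction of at least

$$
1 - \frac{5\,\mathrm{ns}}{7.1\,\mathrm{ns}} \approx 0.30
$$

that is, about 30%, which falls within the 25 to 40% cycle time improvement range given in the introduction. This simple ratio ignores any other contribution to the processor cycle time.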



Fig 6: 16x16 bit Multiplier architecture

Noise detection: An area of 3.4 by 3.4 mm of the prototype is occupied by a matrix of 2,700 loaded BiCMOS 2x2 And-Or's to verify that the noise induced by the switching of the BiCMOS and CMOS circuits does not disturb quiet circuits. The principal switching activity comes from the clock distribution, and the experiment has been made with noisy signal lines connected to polarity hold latches, trying to set them in a wrong state.

#### 6 - CONCLUSION

To get the best benefit in speed and integration, a BiCMOS standard cell circuit library has been designed. The circuits and the physical implementation are 100% compatible with an existing 300K CMOS ASIC sea of gates. The 3.6V BiCMOS circuits with a 15 GHz NPN and 0.45um Leff FETs run from 180ps to 220ps. The CMOS level compatibility is obtained with a feedback or latch terminator circuit. To demonstrate the efficiency of the mixed approach, an 86K-cell, 6.7mm chip has been developed. This chip includes a 16x16 multiplier with a Wallace tree architecture and an array of 2,700 And-Or's for a noise experiment.

| TECHNOLOGY             |                    |
|------------------------|--------------------|
| Bipolar                | FET                |
| Emitter: 0.8 x 4.2 um2 | Leff = 0.45 um     |
| NPN beta: 97           | Tox = 12 nm        |
|                        | Transconductances: |
| CCB: 23 fF             | NFET 152 mA/V/mm   |
| CCS: 48 fF             | PFET 85 mA/V/mm    |
| Rb: 410 Ω              | NFET Vth = 0.5 V   |
| Ft: 15 GHz             | PFET Vth = -0.7 V  |
| Interconnections       |                    |
| Metal 1, 2, 3 (pitch)  | 2.4 um             |
| Metal 4 (pitch)        | 4.8 um             |

Table 1

#### ACKNOWLEDGMENTS

The authors would like to express their thanks to all the people from the Essonnes, Burlington and E.Fishkill laboratories for their contribution. Among them, the support of T.Bednar, S.Bruneau, M.Combes, P.Damas, A.Dansky, AM.Haen, T.Hook, D.Kemerer, P.Mollier, JP.Nuez, J.Petrovick, S.Posson, JP.Rousseau, P.Tannhof, R.Taylor, F.Wallart, S.Zier and A.Zuckerman is gratefully acknowledged.

#### REFERENCES:

- [1] J.Petrovick, R.Taylor, "A 300K-circuit ASIC Logic Family", ISSCC 90
- [2] E.D Johnson, T.B Hook, J.E Bertch, "A high performance 0.5 um BiCMOS technology with 3.3-V CMOS devices", IEEE 1990 Symposium on VLSI Technology, June 90
- [3] G.Boudon, P.Mollier, F.Wallart, P.Tannhof et al., "Improved BiCMOS logic circuit with full swing operation", patent 89480044.0
- [4] G.Boudon, P.Mollier, I.Ong, J.P Nuez, "Multi Emitter BiCMOS logic circuit Family", IEEE 1990 Symposium on VLSI Technology, June 90
- [5] R.Hornung, M.Bonneau, B.Waymel, E.Gould, R.Piro et al., "A versatile VLSI design system for combining Gate array and standard cell circuits on the same chip", 1987 CICC, p 245
- [6] R.Bassett et al., "Low cost testing of high density components", IEEE Intern. Test Conference, pp 550-557, Aug 89
- [7] J.Panner et al., "A 300K-circuit ASIC Logic Family CAD system", IEEE 1990 CICC



Fig 7: Microphotograph of a 2 way NAND with Feedback after 1st level of metal



Fig 8: 6.7mm chip prototype with a mix of CMOS / BiCMOS circuits